Textline detection in degraded historical document images
Abstract
This paper presents a textline detection method for degraded historical documents. Our method follows a conventional two-step procedure in which binarization is performed first and the textlines are then extracted from the binary image. To address the challenges of historical documents, such as document degradation, structure noise, and skew, we develop new methods for both binarization and textline extraction. First, we improve binarization performance by detecting the non-text regions and processing only the text regions. We also improve textline detection by extracting the main text block and compensating for the skew angle and writing style. Experimental results show that the proposed method achieves state-of-the-art performance on several datasets.
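As a rough illustration of the two-step procedure described in the abstract, the sketch below binarizes a grayscale page and then extracts textlines from the binary image. It is not the paper's method: global Otsu thresholding and a horizontal projection profile are swapped in as simple stand-ins, non-text region detection and skew compensation are omitted, and the function names (`otsu_threshold`, `binarize`, `detect_textlines`) and the `min_ink` parameter are hypothetical.

```python
import numpy as np

def otsu_threshold(gray):
    """Return a global Otsu threshold for a uint8 grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of the dark class
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean intensity
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    denom[denom == 0] = np.nan               # skip levels with an empty class
    sigma_b = (mu_total * omega - mu) ** 2 / denom  # between-class variance
    return int(np.nanargmax(sigma_b))

def binarize(gray):
    """Step 1: mark dark (ink) pixels as True."""
    return gray <= otsu_threshold(gray)

def detect_textlines(binary, min_ink=1):
    """Step 2: return (top, bottom) row ranges that contain ink.

    Assumes roughly horizontal, unskewed lines (a simplification).
    """
    profile = binary.sum(axis=1)             # ink pixels per row
    has_ink = profile >= min_ink
    lines, start = [], None
    for row, flag in enumerate(has_ink):
        if flag and start is None:
            start = row                      # a textline begins
        elif not flag and start is not None:
            lines.append((start, row))       # the textline ends before this row
            start = None
    if start is not None:
        lines.append((start, len(has_ink)))
    return lines

# Synthetic page: light background with two dark horizontal bands.
page = np.full((60, 100), 220, dtype=np.uint8)
page[10:20, :] = 30
page[35:45, :] = 30
print(detect_textlines(binarize(page)))
```

On real degraded pages a global threshold and an unskewed projection profile would likely fail, which is the gap the abstract says the proposed binarization and skew compensation are meant to close.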
Similar resources
Ridges Based Curled Textline Region Detection from Grayscale Camera-Captured Document Images
Compared to scanners, cameras offer fast, flexible, and non-contact document imaging, but with distortions such as uneven shading and warped shape. Therefore, camera-captured document images need preprocessing steps like binarization and textline detection for dewarping so that traditional document image processing steps can be applied to them. Previous approaches to binarization and curled text...
Interactive degraded document enhancement and ground truth generation
Degraded documents are frequently obtained in various situations. Examples of degraded document collections include historical document depositories, documents obtained in legal and security investigations, and legal and medical archives. Degraded document images are hard to read and hard to analyze using computerized techniques. There is hence a need for systems that are capable of enhan...
Restoration of Degraded Historical Document Image: An Adaptive Multilayer-Information Binarization Technique
A binary image is the essential format for document image processing, and the subsequent steps depend on the quality of the binarization process. The objective of this research is to propose a new binarization method based on adaptive multilayer information for the restoration of degraded historical document images. This paper focuses on degraded Thai historical document images, whi...
Restoration of Degraded Historical Document Image
Restoration plays a very important role in enhancing degraded, noisy images. Numerous algorithms have been designed to enhance degraded images. Since image processing algorithms are subjective, not all of the algorithms developed will address every type of degradation. To address a specific type of problem, suitable algorithms need to be selected. In this paper a combination of spatial...
Binarization of Document Image
Document image binarization is performed in the preprocessing stage of document analysis and aims to segment the foreground text from the document background. A fast and accurate document image binarization technique is important for the ensuing document image processing tasks such as optical character recognition (OCR). Though document image binarization has been studied for many years, t...
Journal title:
- EURASIP J. Image and Video Processing
Volume 2017
Publication date 2017